18th International Command and Control Research and Technology Symposium
C2  in  Underdeveloped,  Degraded  and  Denied  Operational  Environments 


MULTI-ENTITY  BAYESIAN  NETWORKS  LEARNING 
IN  PREDICTIVE  SITUATION  AWARENESS 


Topic  3:  Data,  Information  and  Knowledge 


Cheol  Young  Park*  [STUDENT] 
Kathryn  Blackmond  Laskey 
Paulo  Costa 
Shou  Matsumoto 


The  Sensor  Fusion  Lab  &  Center  of  Excellence  in  C4I 
The  Volgenau  School  of  Engineering 
George  Mason  University 
4400  University  Drive 
Fairfax,  VA  22030-4444 
(703)332-9921 


cparkf@masonlive.gmu.edu,  [klaskey,  pcosta]@gmu.edu,  smatsum2@masonlive.gmu.edu 


Point  of  Contact:  Cheol  Young  Park 
cparkf@masonlive.gmu.edu  and/or  (703)  332-9921 



1.  REPORT  DATE 

JUN  2013  2.  REPORT  TYPE 

4.  TITLE  AND  SUBTITLE 

Multi-Entity  Bayesian  Networks  Learning  in  Predictive  Situation 
Awareness 

6.  AUTHOR(S) 


7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS(ES) 

George  Mason  University, The  Sensor  Fusion  Lab  &  Center  of  Excellence 
in  C4I,4400  University  Drive, Fairfax, VA, 22030-4444 

9.  SPONSORING/MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 


12.  DISTRIBUTION/AVAILABILITY  STATEMENT 

Approved  for  public  release;  distribution  unlimited 

13.  SUPPLEMENTARY  NOTES 

Presented  at  the  18th  International  Command  &  Control  Research  &  Technology  Symposium  (ICCRTS) 
held  19-21  June,  2013  in  Alexandria,  VA.  U.S.  Government  or  Federal  Rights  License 





ABSTRACT 


Over  the  past  two  decades,  machine  learning  has  led  to  substantial  changes  in  Data  Fusion 
Systems  throughout  the  world.  One  of  the  most  important  application  areas  for  data  fusion  is 
situation  awareness  to  support  command  and  control.  Situation  Awareness  is  perception  of 
elements  in  the  environment,  comprehension  of  the  current  situation,  and  projection  of  future 
status  before  decision  making.  Traditional  fusion  systems  focus  on  lower  levels  of  the  JDL 
hierarchy,  leaving  higher-level  fusion  and  situation  awareness  largely  to  unaided  human 
judgment.  This  becomes  untenable  in  today’s  increasingly  data-rich  environments,  characterized 
by  information  and  cognitive  overload.  Higher-level  fusion  to  support  situation  awareness 
requires  semantically  rich  representations  amenable  to  automated  processing.  Ontologies  are  an 
essential  tool  for  representing  domain  semantics  and  expressing  information  about  entities  and 
relationships  in  the  domain.  Probabilistic  ontologies  augment  standard  ontologies  with  support 
for  uncertainty  management,  which  is  essential  for  higher-level  fusion  to  support  situation 
awareness.  PROGNOS  is  a  prototype  Predictive  Situation  Awareness  (PSAW)  System  for  the 
maritime  domain.  The  core  logic  for  the  PROGNOS  probabilistic  ontologies  is  Multi-Entity 
Bayesian  Networks  (MEBN),  which  combines  First-Order  Logic  with  Bayesian  Networks  for 
representing  and  reasoning  about  uncertainty  in  complex,  knowledge-rich  domains.  MEBN  goes 
beyond  standard  Bayesian  networks  to  enable  reasoning  about  an  unknown  number  of  entities 
interacting  with  each  other  in  various  types  of  relationships,  a  key  requirement  for  PSAW.  The 
existing  probabilistic  ontology  for  PROGNOS  was  constructed  manually  by  a  domain  expert. 
However,  manual  MEBN  modeling  is  labor-intensive  and  insufficiently  agile.  To  address  this 
problem,  we  developed  a  learning  algorithm  for  MEBN-based  probabilistic  ontologies.  This 
paper  presents  a  bridge  between  MEBN  and  the  Relational  Model,  and  a  parameter  and  structure 
learning  algorithm  for  MEBN.  The  methods  are  evaluated  on  a  case  study  from  PROGNOS. 


1  INTRODUCTION 

Over  the  past  two  decades,  machine  learning  has  led  to  substantial  changes  in  Data  Fusion 
Systems throughout the world [White, 1988; Endsley, 1988; Steinberg et al., 1998; Endsley et
al., 2003; Llinas et al., 2004; Linggins et al., 2008]. One of the most important application areas
for  data  fusion  is  Situation  Awareness  (SAW)  to  support  command  and  control  (C2).  Systems  to 
support  SAW  provide  information  regarding  the  present  or  future  situation.  This  information 
supports  situation  assessment  (SA)  and  is  exploited  for  C2  decision  making. 

According to the most commonly cited definition, SAW is composed of three processes:
perception  of  elements  in  the  environment,  comprehension  of  the  current  situation,  and 
projection  of  the  future  status  [Endsley,  1988;  Endsley  et  al.,  2003].  Breton  and  Rousseau 
classified  26  SAW  definitions  and  identified  a  set  of  common  elements  of  SAW.  They  identified 
two  distinct  varieties,  which  they  termed  State-  and  Process-oriented  SAW.  In  their  definition, 
Process-oriented  SAW  focuses  on  the  link  between  the  situation  and  the  cognitive  processes 
generating  SAW,  while  State-oriented  SAW  focuses  on  the  link  between  the  situation  and  an 
internal  representation  of  elements  present  in  the  situation  [Breton  &  Rousseau,  2001]. 

In contrast to traditional SAW, Predictive Situation Awareness (PSAW) emphasizes the
ability to make predictions about aspects of a temporally evolving situation [Costa et al., 2009;
Carvalho et al., 2010]. Traditionally, decision makers are responsible for the higher-level data
fusion in which they use the results of low-level fusion to estimate and predict the evolving
situation. PROGNOS is a prototype system intended to address the need for higher-level data
fusion [Costa et al., 2009; Carvalho et al., 2010]. PROGNOS provides higher-level fusion
through state-of-the-art knowledge representation and reasoning.

The PROGNOS probabilistic ontologies employ Multi-Entity Bayesian Networks
(MEBN), which combines First-Order Logic with Bayesian Networks for representing and
reasoning  about  uncertainty  in  complex,  knowledge-rich  domains  [Laskey,  2008].  MEBN  goes 
beyond  standard  Bayesian  networks  to  enable  reasoning  about  an  unknown  number  of  entities 
interacting  with  each  other  in  various  types  of  relationships.  A  PSAW  system  must  aggregate 
state estimates provided by lower-level information fusion (LLIF) systems to help users
understand  key  aspects  of  the  aggregate  situation  and  project  its  likely  evolution.  A  semantically 
rich  representation  is  needed  that  can  capture  attributes  of,  relationships  among,  and  processes 
associated  with  various  kinds  of  entities.  Ontologies  provide  common  semantics  for  expressing 
information  about  entities  and  relationships  in  the  domain.  Probabilistic  ontologies  (PR-OWL) 
augment  standard  ontologies  with  support  for  uncertainty  management  [Costa,  2005].  PR-OWL 
2  extends  PR-OWL  to  provide  better  integration  with  OWL  ontologies  [Carvalho,  2011].  MEBN 
is the logical basis for the uncertainty representation in the PROGNOS probabilistic ontologies.


1.1  MEBN  for  PSAW 

Figure 1 shows a simplified illustrative example of a problem in PSAW. Our goal is to estimate the
vehicle type (e.g., tracked or wheeled) of a target object and the degree of danger (e.g., high or
low) of a specific region. Figure 1 depicts a specific situation of interest.

Figure 1. Vehicle Identification Context in PSAW


The rectangles in Figure 1 represent instances of entities. Figure 1 expresses two kinds of relations
among entities. A rectangle drawn inside another rectangle represents a part of the entity
represented by the outer rectangle, that is, composition or aggregation. The rectangle
labeled “Communicated = Y” specifies an interconnection relation.

Our system has been provided with the following evidence. At Time 1, a weather sensor
has reported clear weather for Region 1.1. A geographic information system has reported that
Region 1.1 is off-road terrain. Two vehicle objects, V1 and V2, have been detected by an
imaging system, which has reported that V2 is tracked and has failed to report a type for V1. An
MTI sensor indicates that both vehicles are traveling slowly. A COMINT report indicates
communications between V1 and V2. Given this evidence, we want to know the object type of
both vehicles and the danger level of Region 1.1.

We  might  consider  using  a  Bayesian  network  (BN)  [Pearl,  1988]  to  fuse  these  reports 
from  multiple  sources  and  answer  the  queries  of  interest.  Figure  2  shows  a  Bayesian  network  we 
might  use  for  this  problem. 


Each box in the figure depicts a random variable (RV), or node. A label at the top of the box
gives a name for the RV, the labels inside the boxes indicate its possible states, and the numbers
indicate the probability of the state given our current evidence. For example, the RV
VehicleType_v2 denotes the type of vehicle 2. It can have value either Wheeled or Tracked. Arcs
represent direct dependence relationships. For example, ImageTypeReport_rpt1, the type
recorded on imaging sensor report rpt1, depends on VehicleType_v2, the actual type of v2, the
vehicle being observed by the sensor. RVs for which we have evidence are shown in gray and
probabilities are set to 100% for the value that was actually observed. For example, recall that
Region 1.1 was off-road terrain; thus, evidence for OffRoad is applied to the node
TerrainType_region1_1. Given all the evidence we have acquired, we assign 80% probability that V1 is
tracked, 94% probability that V2 is tracked, and 80% probability that the danger level in Region
1.1 is high.


Manual construction of a BN like Figure 2 is feasible, but what about situations
containing hundreds of vehicles and reports? For such situations, MEBN allows us to build up a
complex BN out of modular pieces. Figure 3 shows a MEBN model, called an MTheory, that
expresses our domain knowledge using modular components, called MFrags, that can be
composed into larger models. For example, the ImageTypeReport MFrag expresses knowledge
about the reported type from an imaging sensor. The green pentagons are context RVs that express
conditions under which the MEBN fragment is valid: obj is a vehicle located in region rgn, and
rpt is a report about obj. The gray trapezoid input RVs have their distributions defined in other
MFrags. The yellow oval resident RV, ImageTypeReport(rpt) in this case, has its distribution
defined in this MFrag, and its distribution depends on the vehicle type of obj and the weather of
rgn.


The Vehicle Identification MTheory in Figure 3 contains seven MFrags: Speed,
ImageTypeReport, VehicleObject, Danger, Weather, Region, and Reference. The
MTheory can generate many different BNs specialized to different situations, as depicted in
Figure 4 below. Case 1 is the BN of Figure 2, representing two vehicles with two reports in a
single region at a single time. Case 2 represents five vehicles with five reports in a single region
at a single time. Case 3 represents five vehicles with five reports in a single region at five time
steps.
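
The idea that one set of templates yields differently sized BNs can be sketched in a few lines of Python. This is an illustration only, not the PROGNOS implementation: the RV names follow the text, but the argument types assigned to each template (in particular the time arguments) and the helper names TEMPLATES, make_entities, and ground_nodes are assumptions made for this example. The sketch also ignores context constraints and arcs; it only enumerates the random variable instances each case would contain.

```python
from itertools import product

# Hypothetical RV templates: name -> entity types of its arguments.
# The time arguments on Weather and Speed are assumptions for illustration.
TEMPLATES = {
    "VehicleType": ("Vehicle",),
    "ImageTypeReport": ("Report",),
    "Danger_Level": ("Region",),
    "TerrainType": ("Region",),
    "Weather": ("Region", "Time"),
    "Speed": ("Vehicle", "Time"),
}


def make_entities(n_vehicles, n_reports, n_regions, n_times):
    """Create named entity instances for each entity type."""
    return {
        "Vehicle": [f"v{i}" for i in range(1, n_vehicles + 1)],
        "Report": [f"rpt{i}" for i in range(1, n_reports + 1)],
        "Region": [f"region{i}" for i in range(1, n_regions + 1)],
        "Time": [f"t{i}" for i in range(1, n_times + 1)],
    }


def ground_nodes(entities):
    """Instantiate every template for every combination of matching entities."""
    nodes = []
    for name, arg_types in TEMPLATES.items():
        for combo in product(*(entities[t] for t in arg_types)):
            nodes.append(f"{name}({', '.join(combo)})")
    return nodes


# The three cases described in the text.
case1 = ground_nodes(make_entities(2, 2, 1, 1))
case2 = ground_nodes(make_entities(5, 5, 1, 1))
case3 = ground_nodes(make_entities(5, 5, 1, 5))
print(len(case1), len(case2), len(case3))
```

Under these assumed templates, the number of RV instances grows with the number of entities while the templates themselves stay fixed, which is the modularity argument made above.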


MEBN  has  been  applied  to  situation  assessment  [Laskey,  2000;  Wright  et  al.,  2002;  Costa  et  al., 
2005].  Its  increased  expressive  power  over  ordinary  BNs  is  an  advantage  for  situation  assessment: 

“Military situation assessment requires reasoning about an unknown number of hierarchically
organized entities interacting with each other in varied ways” [Wright et al., 2002].


1.2  Problem  Statement 

In  previous  applications  of  MEBN  to  situation  assessment,  the  MTheory  was  constructed 
manually  by  a  domain  expert  using  the  MEBN  modeling  process  called  Uncertainty  Modeling 
Process  for  Semantic  Technologies  (UMP-ST)  [Carvalho,  2011].  Manual  MEBN  modeling  is  a 
labor-intensive  and  insufficiently  agile  process.  This  paper  addresses  the  question  of  how  to 
move  beyond  this  manual  process.  In  particular,  we  focus  on  machine  learning  methods  in  which 
a  MEBN  theory  is  learned  from  observations  on  previous  situations. 

We  assume  the  availability  of  past  data  from  similar  situations.  Typically,  such  data  are 
stored  in  relational  databases.  Therefore,  we  consider  the  problem  of  how  to  use  data  stored  in  a 
relational  database  for  learning  an  MTheory.  We  take  the  standard  approach  of  decomposing  the 
learning  problem  into  parameter  and  structure  learning,  treating  each  of  these  in  turn. 


1.3  Scope 


This  paper  presents  a  basic  structure  and  parameter  learning  algorithm  for  MEBN  theories  and 
illustrates the method on synthetic data generated from the PROGNOS Simulation. We assume:

1. The data for learning are stored in a relational database.

a.  There  is  a  single  centralized  database  rather  than  multiple  distributed  databases. 

b.  We  do  not  consider  learning  from  unstructured  data. 

2.  The  database  contains  enough  observations  for  accurate  learning. 

3.  There  is  no  missing  data. 

4.  All  RVs  are  discrete.  Continuous  RVs  are  not  considered. 

5.  Learning  is  in  batch  mode.  We  do  not  consider  online  incremental  learning. 

6.  We  do  not  consider  the  problem  of  learning  functions  for  aggregating  influences  from 
multiple  instances  of  the  parents  of  an  RV. 

These  assumptions  will  be  relaxed  in  future  work. 


2  MULTI-ENTITY  BAYESIAN  NETWORK  AND  RELATIONAL  MODEL 

This  section  defines  Multi-Entity  Bayesian  Networks  (MEBN)  and  the  Relational  Model  (RM).  In 
Section  3,  we  present  the  MEBN-RM  Model,  a  bridge  between  MEBN  and  RM  that  will  allow 
data  represented  in  RM  to  be  used  to  learn  a  MEBN  theory. 


2.1  Multi-Entity  Bayesian  Network 

MEBN  represents  domain  knowledge  as  a  collection  of  MFrags.  An  MFrag  (see  Figure  6)  is  a 
fragment  of  a  graphical  model  that  is  a  template  for  probabilistic  relationships  among  instances  of 
its  random  variables.  Random  variables  in  an  MFrag  can  contain  ordinary  variables  which  can  be 
instantiated  for  different  domain  entities.  We  can  think  of  an  MFrag  as  a  class  which  can  generate 
instances  of  BN  fragments,  which  can  then  be  assembled  into  a  Bayesian  network. 

The  following  definition  of  MFrags  is  taken  from  [Laskey,  2008].  An  MFrag  can  contain  three 
kinds  of  nodes:  context  nodes  which  represent  conditions  under  which  the  distribution  defined  in 
the  MFrag  is  valid,  input  nodes  which  have  their  distributions  defined  elsewhere  and  condition  the 
distributions  defined  in  the  MFrag,  and  resident  nodes  with  their  distributions  defined  in  the 
MFrag.  Each  resident  node  has  an  associated  local  distribution,  which  defines  its  distribution  as  a 
function  of  the  values  of  its  parents.  The  RVs  in  an  MFrag  can  depend  on  ordinary  variables.  We 
can  substitute  different  domain  entities  for  the  ordinary  variables  to  make  instances  of  the  RVs  in 
the  MFrag. 
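
As a rough illustration of this definition, an MFrag can be thought of as a small template object whose ordinary variables are bound to entities. The sketch below encodes the ImageTypeReport MFrag from Section 1.1. The class names RV and MFrag, the method names, the textual form of the context nodes, and the argument of Weather are assumptions made for this example; local distributions are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RV:
    """A random variable template with ordinary-variable arguments."""
    name: str
    args: tuple

    def ground(self, binding):
        """Substitute entities for the ordinary variables."""
        return f"{self.name}({', '.join(binding[a] for a in self.args)})"


@dataclass(frozen=True)
class MFrag:
    """A template holding context, input, and resident nodes."""
    name: str
    context: tuple    # conditions under which the MFrag is valid (kept as text here)
    inputs: tuple     # RVs whose distributions are defined in other MFrags
    residents: tuple  # RVs whose distributions are defined in this MFrag

    def instantiate(self, binding):
        """Return the ground input and resident nodes for one binding."""
        return {
            "inputs": [rv.ground(binding) for rv in self.inputs],
            "residents": [rv.ground(binding) for rv in self.residents],
        }


# ImageTypeReport MFrag as described in Section 1.1 (context paraphrased).
image_type_report = MFrag(
    name="ImageTypeReport",
    context=("obj is a vehicle", "obj is located in rgn", "rpt is a report about obj"),
    inputs=(RV("VehicleType", ("obj",)), RV("Weather", ("rgn",))),
    residents=(RV("ImageTypeReport", ("rpt",)),),
)

instance = image_type_report.instantiate(
    {"obj": "v2", "rgn": "region1_1", "rpt": "rpt1"}
)
print(instance)
```

Each distinct binding of entities to obj, rgn, and rpt that satisfies the context produces one such instance, and the instances are then assembled into a single Bayesian network.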

Figure 6 shows the Danger MFrag of the Vehicle Identification MTheory. The Danger
MFrag represents probabilistic knowledge of how the level of danger of a region is measured
depending on the vehicle type of detected objects. For example, if in a region there is a large
number of tracked vehicles (e.g., tanks), the danger level of the region will be high. The context
nodes for this MFrag (shown as pentagons in the figure) show that this MFrag applies when a
Vehicle entity is substituted for the ordinary variable obj, a Region entity is substituted for the
ordinary variable rgn, and a vehicle obj is located in region rgn. The context node rgn =
Location(obj) constrains the values of obj and rgn from the possible instances of vehicle and
region. For example, suppose v1 and v2 are vehicles and r1 is a region in which only v1 is located.
The context node rgn = Location(obj) will allow only an instance of (v1, r1) to be selected, but
not (v2, r1), because r1 is not the location of v2. Next, we see the input node VehicleType(obj),
depicted as a trapezoid. Input nodes are nodes whose distribution is defined in another MFrag. In
Figure 6, the node Danger_Level(rgn) is a resident node, which means its distribution is defined
in the MFrag of the figure. This node Danger_Level(rgn) might be an input node of some other
MFrag, where it would appear as a trapezoid. Like the graph of a BN, the fragment graph shows
statistical dependencies. The local distribution for Danger_Level(rgn) describes its probability
distribution as a function of the input nodes given the instances that satisfy the context nodes. In
our example, the argument, rgn, is the region variable. If the situation involves two regions, r1
and r2, then Danger_Level(r1) and Danger_Level(r2) will be instantiated. The local distribution is
defined in a language called Local Probability Description (LPD) Language. In our example, the
probabilities of the states, high and low, of the Danger_Level(rgn) RV are defined as a function of
the values, high and low, of instances rgn = Location(obj) of the parent nodes that satisfy the
context constraints. For the high state in the first if-scope in the LPD Language, the probability value
is assigned by the function “1 - 1 / CARDINALITY(obj)”. The CARDINALITY
function returns the number of instances of obj satisfying the if-condition. For example, in the
LPD expression of Figure 6, if the situation involves three vehicles and two of them are tracked,
then the CARDINALITY function will return 2. We see that as the number of tracked vehicles
becomes very large, the function “1 - 1 / CARDINALITY(obj)” will tend to 1.

Figure 6. Danger MFrag

Figure 7. SSBN of Danger MFrag (given v1, v2, and v3 as vehicles, and region1_1 as region)


From this Danger MFrag, diverse situation-specific Bayesian Networks (SSBN) can be generated
depending on the specific entities involved in the situation. For example, a single region entity
called region1_1 and three vehicle entities called v1, v2, and v3 will give rise to the SSBN in
Figure 7, with the conditional probability table (CPT) for Danger_Level_region1_1 as shown.
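
To make the CARDINALITY-based local distribution concrete, the following Python sketch enumerates a CPT for Danger_Level_region1_1 given three vehicle-type parents. It is a hedged illustration rather than the LPD of Figure 6: only the expression 1 - 1 / CARDINALITY(obj) for the high state comes from the text. The function names, the state labels, and the default branch used when no vehicle is tracked are assumptions.

```python
from itertools import product

VEHICLE_TYPES = ("Wheeled", "Tracked")


def danger_level_lpd(parent_types):
    """Distribution of Danger_Level given the types of the vehicles in the region."""
    # CARDINALITY(obj): number of parent instances satisfying the if-condition.
    n_tracked = sum(1 for t in parent_types if t == "Tracked")
    if n_tracked > 0:
        high = 1.0 - 1.0 / n_tracked
    else:
        # ASSUMPTION: the default branch is not given in the text.
        high = 0.0
    return {"High": high, "Low": 1.0 - high}


def build_cpt(n_vehicles):
    """Enumerate one CPT row per joint configuration of the parents."""
    return {
        config: danger_level_lpd(config)
        for config in product(VEHICLE_TYPES, repeat=n_vehicles)
    }


cpt = build_cpt(3)  # parents: VehicleType_v1, VehicleType_v2, VehicleType_v3
for config, dist in cpt.items():
    print(config, dist)
```

With three vehicles the table has eight rows. The configuration with two tracked vehicles gives a high-danger probability of 0.5, matching the worked example in Section 2.1.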

An  MTheory  is  a  collection  of  MFrags  that  defines  a  consistent  joint  distribution  over 
random variables describing a domain. The MFrags forming an MTheory should be mutually
consistent.  To  ensure  consistency,  conditions  must  be  satisfied  such  as  no-cycle,  bounded  causal 
depth,  unique  home  MFrags,  and  recursive  specification  condition  [Laskey,  2008].  No-cycle 
means  that  the  generated  SSBN  will  contain  no  directed  cycles.  Bounded  causal  depth  means  that 
depth  from  a  root  node  to  a  leaf  node  of  an  instance  SSBN  should  be  finite.  Unique  home  MFrags 
means  that  each  random  variable  has  its  distribution  defined  in  a  single  MFrag,  called  its  home 
MFrag.  Recursive  specification  means  that  MEBN  provides  a  means  for  defining  the  distribution 
for  a  RV  depending  on  an  ordered  ordinary  variable  from  previous  instances  of  the  RV.  The 
Vehicle  Identification  MTheory  described  above  is  a  set  of  consistent  MFrags  defining  a  joint 
distribution  over  situations  involving  instances  of  its  RVs. 
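
The no-cycle condition can be checked on any single generated SSBN with a standard topological sort. The sketch below uses Python's graphlib module (Python 3.9 or later). The node names are taken from the Danger example, and the function name is_acyclic is hypothetical. Passing this check for one SSBN does not by itself establish the condition for every SSBN the MTheory can generate.

```python
from graphlib import CycleError, TopologicalSorter


def is_acyclic(parents):
    """Return True if the graph, given as node -> list of parents, has no directed cycle."""
    try:
        # static_order() raises CycleError if no topological order exists.
        list(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False


# SSBN fragment from the Danger example: three vehicle types feed one danger node.
ssbn = {
    "Danger_Level_region1_1": [
        "VehicleType_v1",
        "VehicleType_v2",
        "VehicleType_v3",
    ],
}

# A deliberately broken variant in which the danger node feeds back into a parent.
broken = dict(ssbn)
broken["VehicleType_v1"] = ["Danger_Level_region1_1"]

print(is_acyclic(ssbn), is_acyclic(broken))
```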


2.2  Relational  Model 

In  1969,  Edgar  F.  Codd  proposed  the  Relational  Model  (RM)  as  a  database  model  based  on  first- 
order  predicate  logic  [Codd,  1969;  Codd,  1970].  RM  is  the  most  popular  database  model.  A 
relational  database  (RDB)  is  a  database  that  uses  RM  as  its  basic  representation  for  data.  In  RM, 
data are organized as a collection of relations. A relation is an abstract definition of a class of entities
or  a  relationship  that  can  hold  between  classes.  An  instance  of  a  relation  is  depicted  as  a  table  in 
which  each  column  is  an  attribute  of  the  relation  and  each  row,  called  a  tuple,  contains  the  value 
of  each  attribute  for  an  individual  entity  in  the  domain.  An  entry  in  the  table,  called  a  cell,  is  the 
value  of  the  attribute  associated  with  the  column  for  the  entity  associated  with  the  row.  A  key  is 
one  or  more  attributes  that  uniquely  identify  a  particular  domain  entity.  A  primary  key  for  a 
relation  uniquely  identifies  the  individual  entities  in  the  relation;  a  foreign  key  points  to  the 
primary  key  in  another  relation.  The  cardinality  of  a  relation  is  the  number  of  rows  in  the  table, 
i.e.,  the  number  of  entities  of  the  type  represented  by  the  relation.  The  degree  of  the  relation  is  the 
number  of  columns  in  the  table,  i.e.,  the  number  of  attributes  of  entities  of  the  type  represented  by 
the  relation. 


[Figure 8: example tables annotated with relational terminology. Labels recoverable from the figure: Relation: VehicleObject, Region, Time, Location; Attributes: Key, TerrainType, UpperRegion; Primary Key: VehicleKey, TimeKey; Foreign Key: Region; Domain: Wheeled, Tracked; Degree: 3]

Figure 8. Example of Relational Model


Figure  8  shows  a  relational  model  for  the  vehicle  identification  example.  There  are  four  relations 
in  this  model:  VehicleObject,  Region,  Time  and  Location.  We  could  imagine  different  situations, 
each  with  different  vehicles,  regions,  etc.  Each  particular  situation,  like  the  one  depicted  in  Figure 
8,  corresponds  to  an  instance  of  this  relational  model.  The  instance  is  represented  as  a  table  for 
each  of  the  relations,  where  the  columns  represent  attributes  of  the  relation  and  the  rows  represent 
entities. For example, the VehicleObject relation has two attributes: Key, which uniquely identifies
each individual vehicle, and VehicleType, which indicates whether the vehicle is tracked or
wheeled. The VehicleKey attribute in the Location relation is a foreign key pointing to the primary
key of the VehicleObject relation. A row of Location represents a vehicle being located in a region at a
time. 
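
The relational terms above (primary key, foreign key, domain, cardinality, degree) can be illustrated with an in-memory SQLite database. This is a sketch under stated assumptions: the relation and attribute names follow Figure 8, but the column types, the CHECK constraint used to express the domain, and all row values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE VehicleObject (
    "Key" TEXT PRIMARY KEY,
    VehicleType TEXT CHECK (VehicleType IN ('Wheeled', 'Tracked'))  -- domain
);
CREATE TABLE Region ("Key" TEXT PRIMARY KEY, TerrainType TEXT, UpperRegion TEXT);
CREATE TABLE Time ("Key" TEXT PRIMARY KEY);
CREATE TABLE Location (
    VehicleKey TEXT REFERENCES VehicleObject("Key"),
    TimeKey TEXT REFERENCES Time("Key"),
    Region TEXT REFERENCES Region("Key"),
    PRIMARY KEY (VehicleKey, TimeKey)  -- composite primary key
);
""")

# Hypothetical rows for one situation (an instance of the schema).
conn.executemany("INSERT INTO VehicleObject VALUES (?, ?)",
                 [("v1", "Wheeled"), ("v2", "Tracked")])
conn.execute("INSERT INTO Region VALUES ('region1_1', 'OffRoad', 'region1')")
conn.execute("INSERT INTO Time VALUES ('t1')")
conn.executemany("INSERT INTO Location VALUES (?, ?, ?)",
                 [("v1", "t1", "region1_1"), ("v2", "t1", "region1_1")])

# Cardinality: number of rows. Degree: number of columns.
cardinality = conn.execute("SELECT COUNT(*) FROM Location").fetchone()[0]
degree = len(conn.execute("PRAGMA table_info(Region)").fetchall())

# A Location row pointing at an unknown vehicle violates the foreign key.
try:
    conn.execute("INSERT INTO Location VALUES ('v9', 't1', 'region1_1')")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

print(cardinality, degree, fk_rejected)
```

Here the Region relation has degree 3, consistent with the three attributes Key, TerrainType, and UpperRegion labeled in Figure 8.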


3  MEBN-RM  MODEL 

As a bridge between MEBN and RM, we propose the MEBN-RM Model, which specifies how to match
elements of MEBN to elements of RM. We describe this from the MEBN perspective. We begin
by discussing the bridge between context RVs in MEBN and elements of RM. Next, we discuss
the bridge between resident RVs in MEBN and elements of RM.

Figure 9. Example Tables


Figure 9 is used as an example for the next sections. It extends the four tables of Figure 8 by
adding a fifth relation, Report, and a sixth, Communication. The tables VehicleObject, Region, Time,
and Report are called entity tables. Each of these represents a type of entity. Each primary key is
a single column, which uniquely identifies the entity. For example, the Key column in the
VehicleObject table consists of identifiers for the six vehicles in our situation. The Location and
Communication tables are called relationship tables. The primary key of a relationship table
consists of two or more foreign keys (in this case, (VehicleKey, TimeKey) for the Location table).
The Location table represents the region in which an entity is located at a time. The relations and
their attributes (that is, a set of empty tables) are called the schema for the database. A
populated set of tables such as Figure 9 is called an instance of the schema. It is clear that many
different instances of this schema are possible, each corresponding to a different situation.


3.1  Context  Node 

In MFrags, context terms (or nodes) are used to specify constraints under which the local
distributions apply. Thus, they determine which specific entities in a given situation an MFrag applies to.

In  the  MEBN-RM  model,  we  define  four  types  of  data  structure  corresponding  to  context 
nodes:  Isa,  Value-Constraint,  Slot-filler,  and  Entity-Constraint  type. 


Type  Name               Example
1     Isa                Isa(VehicleObject, obj), Isa(Region, rgn),
                         Isa(Time, t), Isa(Report, rpt)
2     Value-Constraint   VehicleType(obj) = Wheeled
3     Slot-Filler        obj = ReportedObject(rpt)
4     Entity-Constraint  Communication(obj1, obj2)

Table 1. Context Node Types in the MEBN-RM Model


3.1.1  Isa  Type 

In MEBN, the Isa random variable (RV) represents the type of an entity. In RM, an entity table
represents a collection of entities of a given type. Thus, an entity table corresponds to an Isa
random variable in MEBN. Note that a relationship table whose primary key is composed of
foreign keys does not correspond to an Isa RV. A relationship table will correspond to the
Entity-Constraint type of context node. In the example, the tables VehicleObject, Region, Time, and
Report are entity tables, so they correspond to the Isa RVs Isa(VehicleObject, obj),
Isa(Region, rgn), Isa(Time, t), and Isa(Report, rpt). The primary key of an entity relation
consists of the entities of the given type in our situation. For example, v1, ..., v6, the entries in the
Key attribute of the VehicleObject relation, denote the six vehicles in the situation depicted by the RM of
Figure 9.
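
The distinction between entity tables and relationship tables can be sketched as a simple test on the primary key. In the Python example below, the table and key names follow the text, while the dictionary layout, the key name assumed for the Report table, and the helper names are assumptions made for illustration.

```python
# Schema summary for Figure 9: primary key and foreign key columns per table.
SCHEMA = {
    "VehicleObject": {"primary_key": ["Key"], "foreign_keys": []},
    "Region": {"primary_key": ["Key"], "foreign_keys": []},
    "Time": {"primary_key": ["Key"], "foreign_keys": []},
    "Report": {"primary_key": ["Key"], "foreign_keys": ["ReportedObject"]},
    "Location": {
        "primary_key": ["VehicleKey", "TimeKey"],
        "foreign_keys": ["VehicleKey", "TimeKey", "Region"],
    },
    "Communication": {
        "primary_key": ["VehicleKey1", "VehicleKey2"],
        "foreign_keys": ["VehicleKey1", "VehicleKey2"],
    },
}

# Ordinary variable used for each entity type, as in Table 1.
VARIABLES = {"VehicleObject": "obj", "Region": "rgn", "Time": "t", "Report": "rpt"}


def is_relationship_table(table):
    """A relationship table has a primary key made of two or more foreign keys."""
    pk = table["primary_key"]
    return len(pk) >= 2 and all(col in table["foreign_keys"] for col in pk)


def isa_context_nodes(schema):
    """Emit one Isa context node per entity table."""
    return [
        f"Isa({name}, {VARIABLES[name]})"
        for name, table in schema.items()
        if not is_relationship_table(table)
    ]


isa_nodes = isa_context_nodes(SCHEMA)
print(isa_nodes)
```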


3.1.2 Value-Constraint Type

The value of an attribute can be used to select those keys which are related to the value. For
example, consider the VehicleObject table in which we have the Vehicle entity with the
VehicleType attribute. The instances of the Vehicle entity are denoted by the primary key (e.g., v1,
v2, v3, v4, v5, and v6). To focus on the entities whose attribute has the value "Wheeled", we
select the set {v1, v5}. In MEBN, this corresponds to the context RV VehicleType(obj) =
Wheeled. In this way, we can represent subsets of entities selected on the basis of the values of
given attributes.
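
The selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary vehicle_object and the helper select_by_value are hypothetical names, and the rows are copied from the VehicleObject example table used in this paper.

```python
# Hypothetical in-memory copy of the VehicleObject table (Key -> VehicleType).
vehicle_object = {
    "v1": "Wheeled",
    "v2": "Tracked",
    "v3": "Tracked",
    "v4": "Tracked",
    "v5": "Wheeled",
    "v6": "Tracked",
}


def select_by_value(table, value):
    """Return the set of keys whose attribute equals `value`.

    This mirrors the Value-Constraint context node VehicleType(obj) = Wheeled:
    the constraint picks out the subset of entities that satisfy it.
    """
    return {key for key, attr in table.items() if attr == value}


wheeled = select_by_value(vehicle_object, "Wheeled")
print(sorted(wheeled))  # prints ['v1', 'v5']
```

The returned set plays the role of the entities for which the context node is satisfied.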


3.1.3  Slot-Filler  Type 

Consider the Report table depicting the Report entity, which has an attribute ReportedObject
referring to a foreign key, VehicleObject.Key. The VehicleObject.Key in the Report entity is an
attribute whose domain is the key of the Vehicle entity in the VehicleObject table. In other words,
this attribute points to an entity of type Vehicle. This attribute represents the vehicle associated
with the corresponding report. For example, from the first row of the table, we see that v1 is the
ReportedObject for the report rpt1. That is, rpt1 is a report about the vehicle v1. We call this a slot
filler attribute, i.e., v1 fills the ReportedObject slot in the rpt1 report. In MEBN, this slot filler
relationship is expressed by v1 = ReportedObject(rpt1).

The foreign key, VehicleObject.Key, is not a primary key for the Report table. This means
it is allowed to have a "null" value, which means an empty cell (i.e., no report is available for the
vehicle). The set of Vehicle and Report entities linked by this attribute is {(v1, rpt1), (v1, rpt2),
(v1, rpt3), (v2, rpt4), (v2, rpt5), (v2, rpt6)}.
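
As a rough sketch of the slot-filler idea, the ReportedObject attribute can be treated as a mapping from report keys to vehicle keys. The names reported_object and slot_filler_pairs are hypothetical, and treating a null cell as Python None is an assumption made only for this illustration; the rows follow the Report example table.

```python
# Hypothetical copy of the Report table: report key -> ReportedObject (foreign key).
reported_object = {
    "rpt1": "v1",
    "rpt2": "v1",
    "rpt3": "v1",
    "rpt4": "v2",
    "rpt5": "v2",
    "rpt6": "v2",
}


def slot_filler_pairs(slot):
    """Return the (vehicle, report) pairs linked by the slot-filler attribute.

    Reports whose slot is null (None here, by assumption) contribute no pair.
    """
    return {(obj, rpt) for rpt, obj in slot.items() if obj is not None}


pairs = slot_filler_pairs(reported_object)
print(sorted(pairs))
```

Looking up reported_object["rpt1"] corresponds to evaluating ReportedObject(rpt1), which yields v1.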


3.1.4  Entity-Constraint  Type 

A relationship table identifies a connection among entity tables by composing two or more keys
of the entity tables. For example, the primary keys of the Communication table are VehicleKey1
and VehicleKey2 from the VehicleObject table. Composing two keys expresses a relationship
between the entities. The connection between entities corresponds to the Entity-Constraint type in
MEBN-RM. The Entity-Constraint node Communication(obj1, obj2) in MEBN expresses a
relation on vehicle entities. From Figure 9, we see that this relation is {(v1, v2), (v2, v3), (v2, v4),
(v2, v5), (v1, v4), (v1, v5)}. This relation corresponds to the set of pairs of communicating
vehicles.


3.2  Resident  Node 


In MFrags, a resident node can be described as a function, predicate, or formula of FOL. MEBN
allows the modeler to specify a probability distribution for the truth-value of a predicate or the
value of a function. Formulas are not probabilistic, and are defined by built-in MFrags [Laskey,
2008]. As noted above, RM is based on first-order predicate logic. In this section, we describe the
correspondence between functions and predicates in FOL and relations in RM.

In  FOL,  a  predicate  represents  a  true/false  statement  about  entities  in  the  domain.  It  is 
expressed  by  a  predicate  symbol  followed  by  a  list  of  arguments.  For  example, 
Communication(x,y)  is  a  predicate  that  expresses  whether  the  entities  indicated  by  the  arguments  x 
and  y  are  communicating.  In  MEBN,  this  predicate  corresponds  to  a  Boolean  RV  with  possible 
values  True  and  False.  In  RM,  we  express  a  predicate  as  a  table  in  which  the  primary  key  consists 
of  all  the  attributes.  These  attributes  are  the  arguments  of  the  predicate,  and  the  rows  of  the  table 
represent  the  arguments  for  which  the  predicate  is  true.  For  example,  the  six  rows  of  the 
Communication  relation  of  Figure  9  correspond  to  the  six  pairs  of  entities  for  which  the  predicate 
Communication  holds. 
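
A minimal sketch of this reading of a predicate: the relation is stored as a set of key tuples, and the predicate is true exactly for the tuples that appear as rows. The names communication and holds are hypothetical, and the pairs are those listed for the Communication relation in Section 3.1.4.

```python
# The six rows of the Communication relation, each row a tuple of vehicle keys.
communication = {
    ("v1", "v2"), ("v2", "v3"), ("v2", "v4"),
    ("v2", "v5"), ("v1", "v4"), ("v1", "v5"),
}


def holds(relation, *args):
    """Evaluate a predicate stored as a relation.

    The predicate is True for an argument tuple exactly when that tuple is a
    row of the table, and False otherwise.
    """
    return tuple(args) in relation


print(holds(communication, "v1", "v2"))  # prints True
print(holds(communication, "v3", "v6"))  # prints False
```

Note that this sketch treats the argument order as significant, so holds(communication, "v2", "v1") is False; whether Communication should be read as symmetric is a modeling decision that the table alone does not settle.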

In FOL, a function is a mapping from domain entities called inputs to a value called the
output. For example, VehicleType(obj) is a function that maps its argument to
Wheeled if it is a wheeled vehicle and Tracked if it is a tracked vehicle; ReportedObject(rpt) is a
function that maps its argument to the object being reported upon. In RM, a function is
represented by a non-key attribute of a table. It maps its argument(s), the primary key(s) for the
relation, to the output, which is the value of the attribute. For example, in Figure 9, the argument
of the function VehicleType is the primary key of the VehicleObject relation, and the output is the
value (either Tracked or Wheeled) of the VehicleType attribute.

Table  2  defines  the  relationship  between  elements  of  RM  and  MEBN. 


RM                 Resident Node

Attribute          Function / Predicate
Key                Arguments
Cell of Attribute  Output

Table 2. Function of MEBN-RM Model


4  THE  BASIC  PARAMETER  AND  STRUCTURE  LEARNING  FOR  MEBN 

This  section  presents  a  basic  structure  and  parameter  learning  method  for  learning  a  MEBN 
theory  from  a  relational  database. 


4.1  Basic  MEBN  Parameter  Learning 

Parameter learning for MEBN estimates the parameters of the local distribution for a resident
node of an MTheory, given the structure of the MTheory and a dataset expressed in RM. By
structure, we mean the nodes, arcs and state spaces in each MFrag, and the parameters of the local
distributions for the resident nodes. For this basic algorithm, we use Maximum Likelihood
Estimation (MLE) to estimate the parameters. Furthermore, we do not address the problem of
aggregating influences from multiple instances of the same parent. We assume that the test dataset
is well modeled by an MTheory with nodes and state spaces as given by the relational database,
and that the local distributions are well modeled by the chosen parametric family. In future
research, we will address the use of an informative prior distribution to represent a priori
information about the parameters.

The influence aggregation problem occurs when there are multiple instances of the parents
of a resident node that satisfy the context constraints in the MFrag. In this case, a domain expert
may provide knowledge about how random variables are aggregated, and an aggregator or
combining rule may be used for estimating the parameter [Getoor et al., 2000; Natarajan et al.,
2009]. We defer consideration of aggregators and combining rules to future work.
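
For a discrete resident node with discrete parents, MLE of a conditional probability table reduces to relative frequencies of the observed value combinations. The sketch below illustrates only that counting step; it is not the implementation used in this work. The function name mle_cpt, the row format (one dict per joined row), and the column name ImageTypeReport are assumptions, and the six example rows pair each report of the Report example table with the type of its reported vehicle.

```python
from collections import Counter, defaultdict


def mle_cpt(rows, child, parents):
    """Estimate P(child | parents) by relative frequency (the MLE).

    rows    -- iterable of dicts, one per row of the (joined) dataset
    child   -- column name of the child variable
    parents -- list of column names of the parent variables
    """
    joint = Counter()     # counts of (parent configuration, child value)
    marginal = Counter()  # counts of parent configuration
    for row in rows:
        config = tuple(row[p] for p in parents)
        joint[(config, row[child])] += 1
        marginal[config] += 1

    cpt = defaultdict(dict)
    for (config, value), count in joint.items():
        cpt[config][value] = count / marginal[config]
    return dict(cpt)


# Illustrative rows: reports joined with the type of the reported vehicle.
rows = [
    {"VehicleType": "Wheeled", "ImageTypeReport": "Wheeled"},
    {"VehicleType": "Wheeled", "ImageTypeReport": "Wheeled"},
    {"VehicleType": "Wheeled", "ImageTypeReport": "Tracked"},
    {"VehicleType": "Tracked", "ImageTypeReport": "Tracked"},
    {"VehicleType": "Tracked", "ImageTypeReport": "Wheeled"},
    {"VehicleType": "Tracked", "ImageTypeReport": "Tracked"},
]

cpt = mle_cpt(rows, child="ImageTypeReport", parents=["VehicleType"])
print(cpt[("Wheeled",)])
```

Value combinations that never occur in the data receive no entry here, which is one reason an informative prior may be preferable when data are sparse.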


4.2  Basic  MEBN  Structure  Learning 

Structure learning for MEBN organizes RVs into MFrags and identifies parent-child
relationships between nodes, given a dataset expressed in RM. The MFrags, their nodes (context,
input, and resident nodes), and the arcs between nodes are learned (see Appendix A). The initial
ingredients of the algorithm are the dataset (DB) expressed in RM, any Bayesian Network
structure searching algorithm (BNSL_alg), and the maximum size of chain (Sc). We utilize a
common Bayesian Network structure searching algorithm to generate a local BN from the joined
dataset of the RM.

The first step of the algorithm is to create the default MTheory. All keys in the entity tables of
the DB are defined as entities of this default MTheory. One default reference MFrag is created,
which will include the resident nodes used for context nodes. Because context nodes are also random
variables, they should be defined in an MFrag such as the reference MFrag. Next, using both entity
and relationship tables, the MFrags, their nodes, and their connections are learned. There are three
For-Loops (#4, #10, and #23 in Appendix A). The first For-Loop treats all tables, while the second
For-Loop treats the joined tables.

In the first For-Loop, the dataset for each table of the DB is retrieved one by one and, by using
any BN structure searching algorithm (BNSL_alg), a graph is generated from the retrieved dataset.
If the graph has a cycle or an undirected edge, a domain expert sets the arc direction manually.
Based on the revised graph, an MFrag is created by using the createMFrag function in Appendix A.

In the second For-Loop, for the joined tables, data associated with relationship tables is
retrieved until the user-specified maximum size of chain (Sc) is reached. The MFrags, their nodes,
and their arcs are generated in the same way as described in the previous paragraph. One difference
is that the aggregating influence situation should be detected by an approach called Framework of
Function Searching for LPD (FFS-LPD), which detects the situation and provides a possible LPD
function in a heuristic way. FFS-LPD can be realized by a domain expert or a program. In
our initial research, the domain expert detects the aggregating influence situation and provides a
reasonable LPD function having an aggregating function in the FFS-LPD context (an automatic
programmed approach is being researched). After checking the LPD function, if any nodes of the
newly generated graph are not used in any MFrag, a new resident node having the name of
the dataset of the graph is created on the default reference MFrag, together with a new MFrag for
the dataset. Otherwise, edges are added between the resident nodes corresponding to the arcs found
by the structure learning algorithm. If there is an arc between nodes in different MFrags, the parent
node is added as an input node to the MFrag of the child node.

Lastly, in the third For-Loop, LPDs are generated by MLE for all resident nodes in the
MTheory.


5  CASE  STUDY 


As noted in Section 1, the purpose of the learned MTheory generated by the presented algorithm
is to estimate and predict a situation in PSAW. In this case study, a learned MTheory is evaluated
by its ability to predict queries of interest.

Our case study uses PROGNOS (Probabilistic OntoloGies for Net-centric Operation
Systems) [Costa et al., 2009; Carvalho et al., 2010]. PROGNOS includes a simulation which
generates simulated ground truth information for the system. The simulation generates 85000
persons, 10000 ships, and 1000 organization entities with various values of attributes and relations
between entities. The data for these entities are stored in a relational database which includes
three entity tables (person, ship, and organization) and two relationship tables (ship_crews and
org_members). The ship_crews table has a paired key comprised of a ShipKey and a PersonKey,
representing the persons serving as crew members on ships. The org_members table has a paired
key comprised of an OrganizationKey and a PersonKey, representing membership of persons in
organizations. A ship may have many crew members, each of whom may be affiliated with
several organizations. The goal of PROGNOS is to classify ships as to whether they are ships of
interest. In our case, this means ships associated with terrorist activities. The classification is made
given evidence about the attributes of the entities. For example, if a ship had a crew member who
has communicated with a terrorist, the ship was on an unusual route, and it was unresponsive, it is
highly likely that the ship is a ship of interest. The database contains an attribute
IsShipOfInterest of the ship table representing the ground truth for whether it is a ship of interest.

To evaluate the algorithm, training and test datasets were generated by the simulation. The
algorithm was used to learn an MTheory from the training dataset, as shown in Figure 11 (one of the
SSBNs from the learned MTheory is shown in Figure 12). In the MTheory, a total of four MFrags
were generated: the default reference MFrag, the org_members MFrag from the
org_members relationship table, the person MFrag from the person entity table, and the ship
MFrag from the ship entity table. The org_members and ship_crews input nodes came from the
org_members and ship_crews relationship tables. After learning the MTheory, the test dataset was
used to evaluate the MTheory. First, a test case from the test dataset was retrieved. Because the state
of the IsShipOfInterest variable is our concern, data from a ship in the test dataset was retrieved.
Based on the ship data, other related data were retrieved. All of these were combined to make the
test data. For example, if a ship was connected to 3 persons and each of the 3 persons was
associated with 3 organizations, then 9 rows of a joined table were retrieved as one test case.
Using this test case, an SSBN was generated from the learned MTheory. The context of the SSBN
corresponds to the context of the test case. For example, using the previous test case example, 1
ship, 3 person, and 9 organization entities are used for generating an SSBN. After the SSBN is
generated, the IsShipOfInterest node, which is Boolean, was queried given several leaf nodes of
the SSBN, with the values of the leaf nodes retrieved from the test data. The queried probability
result was stored in an array. This retrieving and querying process continued until all ships were
treated.

For each of the SSBNs generated from the test data, and for each instance of the
IsShipOfInterest RV in the SSBN, the probability of the IsShipOfInterest RV was computed given
the evidence for the leaf nodes. The accuracy of the queried probability results was measured
using the Receiver Operating Characteristic (ROC) curve. The ROC for our case study is shown
in Figure 10. The area under the curve (AUC) is shown in Table 3. The learned MTheory
estimated the state of the IsShipOfInterest node with an AUC of 0.897206546.
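
As a hedged illustration of how an AUC can be obtained from an array of queried probabilities and ground-truth labels, the sketch below uses the standard pairwise (Mann-Whitney) formulation: the AUC equals the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. The function name roc_auc and the sample numbers are hypothetical and are not the data of this case study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels -- ground truth (truthy = ship of interest)
    scores -- queried probabilities, one per label
    """
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up ground truth and probabilities, for illustration only.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
print(roc_auc(labels, scores))  # prints 0.75
```

The pairwise form is quadratic in the number of cases, so for datasets of the size used here a rank-based or library implementation would be the more practical choice.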


Model            AUC

Learned MTheory  0.897206546

Table 3. AUC of Learned MTheory

Figure 10. ROC of Learned MTheory


6  DISCUSSION  AND  FUTURE  WORK 

This  paper  discussed  reasons  why  MEBN  is  a  useful  modeling  tool  for  PSAW  systems,  providing 
a  semantically  rich  representation  that  also  captures  uncertainty.  MEBN  was  the  core  logic  for  the 
probabilistic  ontologies  used  in  the  PROGNOS  prototype  PSAW  system.  The  original 
PROGNOS  probabilistic  ontologies  were  constructed  manually  with  the  help  of  domain  experts. 
This  manual  MEBN  modeling  was  labor-intensive  and  insufficiently  agile.  To  address  this 
problem,  we  developed  a  learning  algorithm  for  MEBN-based  probabilistic  ontologies.  To  enable 
learning  from  relational  databases,  we  presented  a  bridge  between  MEBN  and  the  Relational 
Model,  which  we  call  the  MEBN-RM  model.  We  also  presented  a  basic  parameter  and  structure 
learning  algorithm  for  MEBN.  Finally,  the  presented  method  was  evaluated  on  a  case  study  from 
PROGNOS. 

Although we provided a basic MEBN learning method, several issues remain. 1) Aggregating
influence problem: how to learn an aggregating function in an aggregating situation where an
instance of a child random variable depends on multiple parent instances which are generated from
an identical class random variable? 2) Optimization of learned MTheory: how to learn an optimized
structure of an MTheory without losing accuracy of query? 3) Unstructured data learning: how to
learn from unstructured data which isn't derived from a data model? 4) Continuous random variable
learning: how to learn an MTheory which includes continuous random variables? 5) Multiple
distributed data learning: how to learn an MTheory from data in multiple distributed databases? 6)
Incomplete data learning: how to approximate parameters of an MTheory from missing data? 7)
Learning in insufficient evidence: how to learn an MTheory from not enough observations? 8)
Incremental MEBN learning: how to learn parameters of an MTheory from updated observations?
There remain many open research issues in this domain. We are currently studying the
aggregating influence problem and continuous random variable learning in PSAW.


Figure 11. Learned PROGNOS MTheory

Figure 12. Generated SSBN from Learned PROGNOS MTheory. (_1 and _0 in the state of the node mean true
and false, respectively. The letters S, O, and P in the title of the node mean Ship, Organization, and
Person, respectively.)

APPENDIX A


Algorithm 1: Basic Structure Learning For MEBN
Procedure BSL_MEBN ( DB,        // Relational database
                     BNSL_alg,  // BN Structure Search algorithm
                     Sc         // Maximum size of chain
                   )

 1  M_theory ← create a default MTheory
 2  M_theory ← add entities from all the keys in the tables of DB
 3  MF_ref ← create a default reference MFrag
 4  for i = 1, ... until size of all tables in DB
 5      T_i ← get table from DB
 6      G_i ← search the graphs in T_i using BNSL_alg
 7      G_i ← revise the graph to ensure no cycle and undirected edge
 8      if G_i ≠ ∅ then
 9          MF_i = createMFrag(G_i, T_i, M_theory)
10  for c = 1, ... until Sc
11      JT ← joinTables(DB, c)
12      for i = 1, ... until size of JT
13          G_i ← search the aggregating graphs using FFS-LPD
14          G_i ← search the graphs in JT_i using BNSL_alg
15          G_i ← revise the graph to ensure no cycle and undirected edge
16          if G_i ≠ ∅ then
17              for j = 1, ... until size of G_i
18                  if any nodes in G_ij is not used for any MFrag then
19                      MF_ref ← create the resident node with the name of JT_i on MF_ref
20                      createMFrag(G_i, JT_i, M_theory)
21                  else
22                      addEdges(G_i, JT_i, ∅)
23  for i = 1, ... until size of all resident nodes in the MTheory
24      T_i ← get dataset related the resident node i
25      calculateLPD(R_i, T_i)
26  return M_theory


Procedure createMFrag ( G_0,      // List of Resident Nodes
                        T_0,      // dataset of table
                        M_theory  // MTheory
                      )

1  MF ← create MFrag using the name of T_0
2  N ← get the nodes of G_0 which are not used for any MFrags of M_theory
3  R ← create the resident nodes corresponding to N
4  MF ← add R into MF with ordinary variables related with R
5  MF ← addEdges(G_0, T_0, MF)
6  Add MFrag into M_theory
7  return MF


Procedure addEdges ( G_0,  // List of Resident Nodes
                     T_0,  // dataset of table
                     MF    // the target MFrag
                   )

 1  for i = 1, ... until the size of the edges of G_0
 2      N_p ← get the resident node corresponding to the parent node of E_i
 3      N_c ← get the resident node corresponding to the child node of E_i
 4      MF_p ← get the MFrag of N_p
 5      MF_c ← get the MFrag of N_c
 6      if MF = MF_c = MF_p then
 7          MF ← add edges between N_p and N_c using E_i
 8      else
 9          if MF_p ≠ MF then
10              MF_p ← create the input node which was the context node of MF and add it into MF_p
11          if MF_c ≠ MF then
12              MF_c ← create the input node which was the context node of MF and add it into MF_c
13          MF_c ← create the input node from N_p and add it into MF_c
14  return MF


Procedure calculateLPD ( R,   // List of Resident Nodes
                         T_0  // dataset of table
                       )

1  for i = 1, ... until size of R
2      R_i.LPD ← calculate default probabilities of R_i using T_0
3      if R_i is in Many-to-One connection then
4          R_i.LPD ← assign the LPD which is generated by FFS-LPD
5      else
6          R_i.LPD ← calculate the conditional probabilities of R_i

Procedure joinTables ( DB,  // Relational database
                       c    // range of the chain
                     )

1  RT ← get the relationship tables of DB
2  for i = 1, ... until size of RT
3      jt ← join all related tables in the range, c, from RT_i
4      JT ← add jt into JT except the jt already added
5  return JT

1. Introduction

Data  fusion-SAW-C2 

•  Data  Fusion 

•  Integration  Process  of  multiple  data  and  knowledge 

•  Situation  Awareness  (SAW) 

•  Perception 

•  Comprehension 

•  Projection 

•  Predictive  Situation  Awareness  (PSAW) 

•  Estimation  and  prediction  of  an  evolving  situation  over  time 


1.  Introduction 

An example of PSAW situation

What is the type of V1 given the observations?

What is the danger level of region 1.1 given the observations?

1. Introduction

Bayesian Networks for the example

Observations: Terrain Type of region 1.1
Queries: Vehicle Type of V1

1. Introduction

Bayesian Networks for the example

Observations: Terrain Type, Weather, Image of V2, Speed of V1 and V2
Queries: Vehicle Type of V1, Danger level of region 1.1


1. Introduction

Bayesian Networks for the example

Observations: Image type and speed for V1 ~ V5 in time 1 ~ 5
Queries: Vehicle Type of V1, V2, V3, V4, and V5

1. Introduction

MEBN Model (MTheory) from the example

1. Introduction

SSBN generation

Given entities, the MTheory can generate many different BNs

1. Introduction

A  Danger  MFrag 


[Figure: Danger_MFrag with a labeled Context Node (C), Input Node (I), and Resident Node; one
visible node label reads (rgn = Location(obj))]

Context node represents conditions under which the distribution defined in the MFrag is valid
Resident node is a random variable containing a term of First Order Logic
Input node is an imported resident node from other MFrag

1. Introduction

Generated SSBN from the Danger MFrag

Given entities V1, V2, V3, and Region1.1, the above situation-specific Bayesian Network
(SSBN) is derived from the Danger MFrag with the conditional probability table (CPT)

2. Problem Statement

•  Old  approach 

•  Manual  MEBN  modeling 

•  Problem  of  Manual  MEBN  modeling 

•  labor-intensive 

•  insufficiently  agile  process 


3.  Basic  MEBN  Learning 


•  MEBN-RM(Relational  Model)  Model 

•  Basic  MEBN  Parameter  Learning 

•  Basic  MEBN  Structure  Learning 


3.  Basic  MEBN  Learning 

MEBN-RM  Model 


3.  Basic  MEBN  Learning 

Basic MEBN Parameter Learning

To estimate the parameter, Maximum Likelihood Estimation (MLE)
is used

3. Basic MEBN Learning

Basic MEBN Structure Learning


Optimal MTheory

To search a local BN, the GES (Greedy Equivalence Search) algorithm is
used

3. Basic MEBN Learning

Basic  MEBN  Structure  Learning  Algorithm 


[Figure: the RM dataset and any Bayesian Network structure algorithm are the inputs to the
Basic MEBN Structure Learning Algorithm (Algorithm 1, listed in Appendix A)]

The initial ingredients of the algorithm are the dataset expressed in RM
and any Bayesian Network Structure searching algorithm

4. Case Study


•  Generating  Training  and  Test  data 

•  Evaluating  MTheory 

•  Learned  MTheory 

•  Accuracy  of  P(  SOI(Ship  Of  Interest)  | 
Evidences) 


4.  Case  Study 

Generating  Training  and  Test  data 


[Figure: PROGNOS Simulator screenshot; the PROGNOS Simulation Module generates the Training
Dataset and the Test Dataset]

PROGNOS (Probabilistic OntoloGies for Net-centric Operation Systems)

PROGNOS is a prototype Predictive Situation Awareness (PSAW) System for the maritime domain

4. Case Study

Evaluating  MTheory 


[Figure: evaluation workflow with the labeled elements Training Dataset, Learning MTheory,
Generating Many SSBNs, Test Dataset, Providing a SSBN, Providing Ground Truth, and
Calculating accuracy of p(Ship Of Interest | Evidences)]

By calculating accuracy of p(SOI | Evidences), we evaluate the parameter of
the learned MTheory, but we didn't evaluate the structure of the MTheory

4. Case Study

Learned  PROGNOS  MTheory 


[Figure: learned PROGNOS MTheory]

The default reference, org_members, person, and ship MFrags are learned
from the training data set

4. Case Study

Generated  SSBN  from  Learned  PROGNOS  MTheory 


_1 and _0 in the state of the node mean true and false, respectively

The letters S, O, and P in the title of the node mean Ship, Organization, and Person,
respectively

4. Case Study

Accuracy of P(SOI | Evidences)

Model            AUC

Learned MTheory  0.897206546

Table 3. AUC of Learned MTheory

[Figure 10. ROC of Learned MTheory; horizontal axis: False Positive Rate, from 0 to 1]

The learned MTheory estimated p(Ship Of Interest | Evidences) with the area
under the curve (AUC), 0.897206546

5. Conclusion


•  Basic  MEBN  Learning 

•  MEBN-RM  Model 

•  MEBN  Parameter  Learning 

•  MEBN  Structure  Learning 

•  Current  Work 

•  Hybrid  random  variable  learning  in  PSAW 


Thank you for viewing our
presentation!

Back up 1

There remain many open research issues in this domain

1) Aggregating influence problem; how to learn an aggregating function in an
   aggregating situation where an instance child random variable depends on multiple
   instance parents which is generated from an identical class random variable?
2) Optimization of learned MTheory; how to learn an optimized structure of an MTheory
   without losing accuracy of query?
3) Unstructured data learning; how to learn unstructured data which isn't derived from a
   data model?
4) Continuous random variable learning; how to learn an MTheory which includes
   continuous random variables?
5) Multiple distributed data learning; how to learn an MTheory from data in multiple
   distributed databases?
6) Incomplete data learning; how to approximate parameters of an MTheory from
   missing data?
7) Learning in insufficient evidence; how to learn an MTheory from not enough
   observations?
8) Incremental MEBN learning; how to learn parameters of an MTheory from updated
   observations?

We have studied the aggregating influence problem and continuous random
variable learning in PSAW

Back up 2

•  The  data  for  learning  are  stored  in  a  relational  database 

•  There  is  a  single  centralized  database  rather  than 
multiple  distributed  databases 

•  We  do  not  consider  learning  from  unstructured  data 

•  The  database  contains  enough  observations  for  accurate 
learning 

•  There  is  no  missing  data 

•  All  RVs  are  discrete 

•  Continuous  RVs  are  not  considered 

•  Learning  is  in  batch  mode 

•  We  do  not  consider  online  incremental  learning 

•  We  do  not  consider  the  problem  of  aggregating  influences 
from  multiple  instances  of  the  parents  of  an  RV 


4.  Background 

Relational  Model  Example 


Relations: VehicleObject, Region, Time, Location

Region (attributes: Key, TerrainType, UpperRegion)

Key      TerrainType  UpperRegion
r1       OffRoad      null
r1_1     Road         r1
r1_2     OffRoad      r1
r2       OffRoad      null
r2_1     OffRoad      r2
r2_1_1   Road         r2_1

VehicleObject (domain of VehicleType: Wheeled, Tracked)

Key  VehicleType
v1   Wheeled
v2   Tracked
v3   Tracked
v4   Tracked
v5   Wheeled
v6   Tracked

Time

Key  PreviousTime
t1   null
t2   t1
t3   t2
t4   t3
t5   t4
t6   t5

Location (primary key: VehicleKey, TimeKey; foreign key: Region)

VehicleKey  TimeKey  Region
v1          1        r1
v1          2        r1
v1          3        r1
v2          1        r2_1
v2          2        r2_1
v2          3        r2_1

Degree: 3
Tuple: first row
Cardinality: 6


In  1969,  Edgar  F.  Codd  proposed  the  Relational  Model  (RM)  as  a  database  model  based 
on  first-order  predicate  logic  [Codd,  1969;  Codd,  1970] 
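The relational terms annotated on this slide (attribute, tuple, degree, cardinality, domain, key) can be made concrete with a small sketch. It encodes the Region relation from the example as Python named tuples. The class and variable names are illustrative only, and Region is used here simply because it also has three attributes and six tuples.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    """One tuple (row) of the Region relation; the fields are its attributes."""
    key: str
    terrain_type: str
    upper_region: Optional[str]  # None stands for the null entries on the slide


region = [
    Region("r1", "OffRoad", None),
    Region("r1_1", "Road", "r1"),
    Region("r1_2", "OffRoad", "r1"),
    Region("r2", "OffRoad", None),
    Region("r2_1", "OffRoad", "r2"),
    Region("r2_1_1", "Road", "r2_1"),
]

degree = len(Region._fields)       # number of attributes
cardinality = len(region)          # number of tuples
first_tuple = region[0]            # a single tuple of the relation
# Values of TerrainType that actually occur in this relation.
terrain_domain = {r.terrain_type for r in region}
# A primary key must identify each tuple uniquely.
keys_unique = len({r.key for r in region}) == cardinality
```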
4.  Basic MEBN Learning

Example  of  MEBN  Structure  Learning 


Report (entity table)

rpt   ImageTypeReport  ReportedObject
rpt1  Wheeled          v1
rpt2  Wheeled          v1
rpt3  Tracked          v1
rpt4  Tracked          v2
rpt5  Wheeled          v2
rpt6  Tracked          v2

Vehicle (entity table)

obj  VehicleType
v1   Wheeled
v2   Tracked
v3   Tracked
v4   Tracked
v5   Wheeled
v6   Tracked

Location (relationship table)

obj  t   rgn
v1   t1  r1
v1   t2  r1
v1   t3  r1
v2   t1  r2_1
v2   t2  r2_1
v2   t3  r2_1

Region (entity table)

rgn      TerrainType  UpperRegion
r1       OffRoad      null
r1_1     Road         r1
r1_2     OffRoad      r1
r2       OffRoad      null
r2_1     OffRoad      r2
r2_1_1   Road         r2_1


Ingredients  for  MEBN  Structure  Learning: 

Relational  Dataset,  Any  BN  Structure  Learning  Algorithm 
4.  Basic MEBN Learning

Example  of  MEBN  Structure  Learning 


[Figure: the entity tables Vehicle (obj, VehicleType), Region (rgn, TerrainType, UpperRegion), and Report (rpt, ImageTypeReport, ReportedObject), with the Vehicle table mapped to Vehicle_MFrag, which contains the context node isA(obj, Vehicle) and the resident node VehicleType(obj). The Location relationship table (obj, t, rgn) is shown alongside.]
1.  For every entity table, generate an MFrag

2.  The graph is derived by the BN structure learning algorithm
4.  Basic MEBN Learning

Example  of  MEBN  Structure  Learning 


[Figure: the Region, Vehicle, Location, and Report tables, with the Vehicle and Region entity tables joined through the Location relationship table.]

Vehicle - Location - Region (joined table)

obj  rgn   VehicleType  TerrainType
v1   r1    Wheeled      OffRoad
v2   r2_1  Tracked      OffRoad
...  ...   ...          ...

3.  For every relationship table, get the joined table
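A minimal sketch of this join step, assuming the tables are held in plain Python structures. The function name `join_vehicle_location_region` and the dictionary layout are illustrative and are not part of the presented implementation. Each Location tuple is extended with the VehicleType of its vehicle and the TerrainType of its region.

```python
# Entity tables from the example, keyed by their primary keys.
vehicle = {"v1": "Wheeled", "v2": "Tracked", "v3": "Tracked",
           "v4": "Tracked", "v5": "Wheeled", "v6": "Tracked"}
region = {"r1": "OffRoad", "r1_1": "Road", "r1_2": "OffRoad",
          "r2": "OffRoad", "r2_1": "OffRoad", "r2_1_1": "Road"}
# Relationship table Location(obj, t, rgn).
location = [("v1", "t1", "r1"), ("v1", "t2", "r1"), ("v1", "t3", "r1"),
            ("v2", "t1", "r2_1"), ("v2", "t2", "r2_1"), ("v2", "t3", "r2_1")]


def join_vehicle_location_region(vehicle, location, region):
    """Inner join of Vehicle and Region through the Location relationship table."""
    rows = []
    for obj, _t, rgn in location:
        # Keep only tuples whose foreign keys resolve in both entity tables.
        if obj in vehicle and rgn in region:
            rows.append({"obj": obj, "rgn": rgn,
                         "VehicleType": vehicle[obj],
                         "TerrainType": region[rgn]})
    return rows


joined = join_vehicle_location_region(vehicle, location, region)
```

The joined rows place VehicleType and TerrainType side by side, which is what allows a BN structure learning algorithm to look for a dependency between attributes of different entities.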




4.  Basic  MEBN  Learning 

Example  of  MEBN  Structure  Learning 


[Figure: the joined Vehicle - Location - Region table (columns obj, rgn, VehicleType, TerrainType) mapped onto two MFrags. Vehicle_MFrag contains the context nodes isA(obj, Vehicle), isA(rgn, Region), and rgn = Location(obj), together with the nodes TerrainType(rgn) and VehicleType(obj). Region_MFrag contains TerrainType(rgn). The Location relationship table (obj, t, rgn) is shown alongside.]
4.  Create links between the joined entities

5.  Add context nodes
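Steps 4 and 5 can be sketched with a toy MFrag container. The `MFrag` dataclass, the function `link_joined_entities`, and the string encoding of nodes are all hypothetical. The edge direction TerrainType(rgn) → VehicleType(obj) is an assumption made for illustration; in practice it would come from the BN structure learning algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class MFrag:
    """Toy MFrag: nodes are plain strings, edges are (parent, child) pairs."""
    name: str
    context_nodes: list = field(default_factory=list)
    input_nodes: list = field(default_factory=list)
    resident_nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)


def link_joined_entities(mfrag, parent_node, child_node, join_context):
    """Step 4: add the parent as an input node and link it to the child.
    Step 5: add the context nodes that describe the join."""
    if parent_node not in mfrag.input_nodes:
        mfrag.input_nodes.append(parent_node)
    if (parent_node, child_node) not in mfrag.edges:
        mfrag.edges.append((parent_node, child_node))
    for ctx in join_context:
        if ctx not in mfrag.context_nodes:
            mfrag.context_nodes.append(ctx)
    return mfrag


# Vehicle_MFrag as generated from the Vehicle entity table (step 1).
vehicle_mfrag = MFrag("Vehicle_MFrag",
                      context_nodes=["isA(obj, Vehicle)"],
                      resident_nodes=["VehicleType(obj)"])
link_joined_entities(vehicle_mfrag,
                     parent_node="TerrainType(rgn)",
                     child_node="VehicleType(obj)",
                     join_context=["isA(rgn, Region)", "rgn = Location(obj)"])
```

The context nodes restrict the edge to those vehicle and region instances that are actually related through Location.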
4.  Basic MEBN Learning

Example  of  MEBN  Structure  Learning 


By iterating the above process, the MTheory above can be learned.
4.  Basic MEBN Learning

Basic  MEBN  Structure  Learning 


Algorithm 1: Basic Structure Learning for MEBN
Procedure BSL_MEBN (
    DB,        // Relational database
    BNSL_alg,  // BN structure search algorithm
    Sc         // Maximum size of chain
)
 1  M_theory ← create a default MTheory
 2  M_theory ← add entities from all the keys in the tables of DB
 3  MF_ref ← create a default reference MFrag
 4  for i = 1, ... until size of all tables in DB
 5      T_i ← get table from DB
 6      G_i ← search the graphs in T_i using BNSL_alg
 7      G_i ← revise the graph to ensure no cycle and no undirected edge
 8      if G_i ≠ ∅ then
 9          MF_i = createMFrag(G_i, T_i, M_theory)
10  for c = 1, ... until Sc
11      JT ← joinTables(DB, c)
12      for i = 1, ... until size of JT
13          G_i ← search the aggregating graphs using FFS-LPD
14          G_i ← search the graphs in JT_i using BNSL_alg
15          G_i ← revise the graph to ensure no cycle and no undirected edge
16          if G_i ≠ ∅ then
17              for j = 1, ... until size of G_i
18                  if any node in G_ij is not used for any MFrag then
19                      MF_ref ← create the resident node with the name of JT_i on MF_ref
20                      createMFrag(G_i, JT_i, M_theory)
21                  else
22                      addEdges(G_i, JT_i, M_theory)
23  for i = 1, ... until size of all resident nodes in the MTheory
24      T_i ← get dataset related to the resident node i
25      calculateLPD(R_i, T_i)
26  return M_theory


Structure learning organizes RVs into MFrags and identifies parent-child relationships between nodes, given a dataset expressed in the RM.
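Once the parent-child relationships are fixed, each resident node still needs a local probability distribution (the calculateLPD step of Algorithm 1). The sketch below estimates one by relative frequencies, under the assumptions stated in this talk (discrete RVs, no missing data). The function name `estimate_lpd` and the row format are hypothetical, and this is a plain maximum-likelihood count, not necessarily the estimator used by the authors.

```python
from collections import Counter, defaultdict


def estimate_lpd(rows, child, parents):
    """Estimate P(child | parents) by relative frequency over the rows."""
    counts = defaultdict(Counter)
    for row in rows:
        parent_config = tuple(row[p] for p in parents)
        counts[parent_config][row[child]] += 1
    lpd = {}
    for parent_config, child_counts in counts.items():
        total = sum(child_counts.values())
        # Normalize the counts for this parent configuration.
        lpd[parent_config] = {state: n / total for state, n in child_counts.items()}
    return lpd


# Rows of the joined Vehicle - Location - Region table from the example:
# v1 (Wheeled) is in r1 (OffRoad) at t1..t3, v2 (Tracked) is in r2_1 (OffRoad) at t1..t3.
joined_rows = (
    [{"VehicleType": "Wheeled", "TerrainType": "OffRoad"}] * 3
    + [{"VehicleType": "Tracked", "TerrainType": "OffRoad"}] * 3
)
lpd = estimate_lpd(joined_rows, child="VehicleType", parents=["TerrainType"])
```

With only six rows, all of them OffRoad, the estimate says nothing about the Road case, which shows why the talk assumes that the database contains enough observations for accurate learning.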